Abstract

Background: With about half a billion cases, nearly one million of them fatal, malaria is one of the
major infectious diseases worldwide. A recently revived effort to eliminate the disease also focuses on IT resources
for its efficient control, which prominently includes the control of the mosquito vectors that transmit the
Plasmodium pathogens. As part of this effort, IDOMAL has been developed and is continually being updated.

Findings: In addition to improvements to IDOMAL's structure and the correction of some inaccuracies,
major subdomain additions were made, such as a section on natural products and remedies, and several terms
were imported from other, higher-order ontologies and merged with IDOMAL terms. Effort was put into
making IDOMAL fully compatible as an extension of IDO, the Infectious Disease Ontology. The
difficulties in fully reaching that target stem from the inherent differences between vector-borne diseases and "classical"
infectious diseases, which make it necessary to adjust the ontology's architecture specifically so that it can accommodate
vectors and their populations.

Conclusions: In addition to a higher coverage of domain-specific terms and optimized usage by databases and
decision-support systems, the new version of IDOMAL described here allows for more cross-talk with
other ontologies, in particular IDO. The malaria ontology is available for download at the OBO Foundry
(http://www.obofoundry.org/cgi-bin/detail.cgi?id=malaria_ontology) and the NCBO BioPortal
(http://bioportal.bioontology.org/ontologies/1311).

Findings

Background 

Although eradicated from most of the non-tropical regions
of the world for decades, malaria is still considered
one of the major scourges of mankind, affecting
hundreds of millions of people in the tropical regions
of the world [1]. Recent years have witnessed a
revival of the idea of eradicating the disease, although
this time the prevailing goal is elimination rather
than complete eradication [2,3]. To achieve this objective,
emphasis has to be put on disease control, aiming
at both the disease as such (prevention, diagnosis and
treatment) and, most importantly, at vector control [1].
On both fronts, different measures have to be chosen,
and actions such as vaccine and novel antimalarial drug
development, innovative strategies for vector control
and vector population monitoring, etc., have to be
prominently assisted by approaches based on Information
Technology (IT). There is a clear need for new,
effective tools that can combine different, yet
related datasets covering various aspects of the
disease (e.g. epidemiological and entomological data,
intervention efforts, etc.). These tools encompass
resources such as smart databases (including decision
support systems), enhanced bioinformatics software and
the use of technologies such as the Internet and mobile
telephony for the fast transfer of data. The latter is
especially crucial, given that malaria usually strikes the
world's poorest areas, in countries in which general
infrastructure is often underdeveloped.

It is now established that ontologies help overcome
several difficulties encountered in the wide usage of IT
resources by achieving enhanced interoperability. This is
why we decided to put emphasis on the development
of ontologies that cover the domains of both
vector-borne diseases, including malaria, and the vectors
that transmit their pathogens. We have therefore
developed a series of ontologies [4] that describe the anatomy
of mosquitoes and ticks, mosquito insecticide resistance,
and malaria as the first disease in this category. It
was decided early on that the latter ontology, IDOMAL
[5], would be built as an extension of IDO, the
Infectious Disease Ontology [6]. The first version of
IDOMAL was made publicly available three years ago, at
a time when IDO was still at a very early stage of
development. This "wrong order" obviously led to some
discrepancies between the two ontologies, which
have to be reconciled if IDOMAL is to be considered a
bona fide extension of IDO. It is particularly important
to achieve this status as early as possible, since we are in
the process of developing additional ontologies in the
domain of vector-borne diseases such as, for example,
IDODEN, an as yet unpublished ontology for dengue fever
[7]. IDODEN follows the same architecture as IDOMAL,
which proved to be extremely efficient in terms
of ontology design. We therefore decided to
restructure IDOMAL at this point in order to avoid major
future "repair" work on several ontologies. Here, we
describe both the changes introduced in IDOMAL for
this purpose and several other changes that have
been made.

Updating IDOMAL 

All significant changes introduced to IDOMAL are
indicated in Table 1. In total, 222 terms were added, while
another 207 changes of a different nature were made.
All terms and relations that are included in the
ontologies described here are in italics. The first
update is the conversion of IDOMAL from the OBO
format to OWL. The advantages and shortcomings of
either format are not discussed
here. Given, though, the increasing spread of the OWL
format among bio-ontologies, we decided to proceed
with the conversion, while making both the
OBO and the OWL formatted versions available to the
community. We used the OBOtoOWL script that we
previously described [8] to achieve the conversion.



The next change introduced to IDOMAL could be
called technical, since it concerned editing the
ontology to make sure that all of its terms
have complete is_a paths to the root, which
now, in accordance with BFO (Basic Formal Ontology)
[9-11], is entity. Therefore the term role, which was one
of many root-level terms in previous versions, has now taken
its proper place under realizable entity,
which is a specifically dependent continuant, which
is a dependent continuant, which is a continuant, which
is an entity, the root term of BFO. Furthermore,
condition, which was also a top-level term, is now a sibling of
role, in accordance with the BFO 1.0 specification [9-11].
Finally, we corrected typos that were found, and added a
small number of terms that had either escaped our
attention or whose inclusion was deemed necessary given the
latest developments in malaria research.
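
The completeness check described above can be sketched in a few lines. The following is a minimal illustration only, not the procedure actually used for IDOMAL: it assumes that the is_a links have already been extracted into a dictionary mapping each term to a single parent, and the helper names `path_to_root` and `terms_without_complete_path` are hypothetical. The example hierarchy is the BFO path for role and condition given in the text.

```python
# Hypothetical child -> parent map of is_a links (single parent assumed).
IS_A = {
    "role": "realizable entity",
    "condition": "realizable entity",
    "realizable entity": "specifically dependent continuant",
    "specifically dependent continuant": "dependent continuant",
    "dependent continuant": "continuant",
    "continuant": "entity",
}
ROOT = "entity"


def path_to_root(term, is_a=IS_A, root=ROOT):
    """Return the is_a path from term to root, or None if the path is broken."""
    path = [term]
    seen = {term}
    while path[-1] != root:
        parent = is_a.get(path[-1])
        if parent is None or parent in seen:  # missing link or cycle
            return None
        path.append(parent)
        seen.add(parent)
    return path


def terms_without_complete_path(terms, is_a=IS_A, root=ROOT):
    """List the terms whose is_a chain does not reach the root."""
    return [t for t in terms if path_to_root(t, is_a, root) is None]


if __name__ == "__main__":
    print(" -> ".join(path_to_root("role")))
    # "orphan term" is a made-up term without a parent, standing in for the
    # terms that lacked complete is_a paths in earlier versions.
    print(terms_without_complete_path(["role", "condition", "orphan term"]))
```

A term is reported as incomplete when its chain of parents stops before reaching entity or loops back on itself; terms with several parents would need a small extension of the same idea.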

A further major addition to IDOMAL concerns the
inclusion of terms from the sub-domain of natural
remedies and medicinal plants. Chemotherapy of malaria has
been increasingly hindered by the development of
resistance of Plasmodium parasites against antimalarial drugs
[12]. The search for novel antimalarials has therefore
also turned its attention to traditional remedies, in
particular natural products derived from plants. It
should be stressed that one such product and
its derivatives (artemisinins), against which widespread
resistance had not previously developed, are now at
risk of it [13]. To cover the domain we relied mostly on a
handbook [14] that treats it comprehensively. We should
underline that we concentrated on
terms describing substances and procedures for
which a certain degree of efficacy had previously been
shown. As is the case for "conventional"
antimalarials, we do not consider the set of included terms
complete, and we plan to make further
additions if necessary.

Table 1 List of changes and corrections introduced in IDOMAL

Update element | Action | Comments
Non-availability of IDOMAL in OWL format | Used OBOtoOWL script to generate | Both versions publicly available
Several terms lacked "is_a" paths to top | Completed "is_a" paths for all terms |
Multiple (11) root terms | Fully adopted BFO hierarchy to top-level | entity is now the single root term
Availability of terms in other ontologies | Imported (merged) terms from other ontologies |
"Missing" terms | 146 terms added |
Absence of terms relating to natural products/remedies | 76 terms added (included in those above) |
Minor problems (typos, individual terms, definitions) | Edited/completed |
Discrepancies with IDO | | See Table 2 for details

Beyond this important addition, we also decided
to perform a wide "exchange" of terms. The decision
was based on the observation that, in the now rapidly
expanding field of bio-ontologies, the same term is often
defined differently and is also linked through is_a relations
to different parent terms. For example, querying the
BioPortal [15], one finds that the term symptom is now
described in 15 distinct ontologies. Interestingly, IDO
includes symptom as a quality with no child terms;
the Ontology for General Medical Science (OGMS) [16]
has entity as its parent and lists only one child (pain);
and the Influenza Ontology (FLU) [17] also has
entity as the parent of symptom (also imported from
OGMS) and lists 8 children, which, however, do not
include pain. It should be noted that FLU is an
extension of IDO. In IDOMAL, symptom (with a large
number of children) is linked to condition of the malaria
host, which is a child of condition. We decided
to replace all children of symptom with those listed by
the Symptom Ontology (SYMP) [18], in which symptom
is the root. By replacing terms we actually mean
merging the terms from SYMP with those in IDOMAL.
Merging, rather than replacing, leaves both IDs intact
within IDOMAL; anybody who has already
been using IDOMAL will therefore not need to
change the software that uses the ontology.
SYMP was chosen purely because
alternatives such as OGMS do not list the terms
that we needed. Finally, we should state at this point that
IDO uses the symptom term imported from OGMS
(see below).
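
To illustrate why merging, as opposed to replacing, leaves existing software unaffected, the sketch below keeps the identifier of the IDOMAL term and registers the SYMP identifier as an additional key for the same entry. This is an illustration under stated assumptions only: the `Term` class, the `merge_terms` helper, the example term name and the placeholder identifiers are all hypothetical and do not reflect how the merge was carried out in the ontology itself.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    ids: set = field(default_factory=set)  # every ID that resolves to this term


def merge_terms(index, kept_id, imported_id):
    """Make imported_id resolve to the same Term object as kept_id."""
    term = index[kept_id]
    term.ids.add(imported_id)
    index[imported_id] = term
    return term


if __name__ == "__main__":
    # Placeholder identifiers, not real IDOMAL or SYMP accessions.
    index = {"IDOMAL:XXXXXXX": Term("fever", {"IDOMAL:XXXXXXX"})}
    merge_terms(index, "IDOMAL:XXXXXXX", "SYMP:YYYYYYY")
    # A lookup by the old IDOMAL ID and a lookup by the SYMP ID
    # now retrieve the very same term.
    print(index["IDOMAL:XXXXXXX"] is index["SYMP:YYYYYYY"])
```

Had the IDOMAL term simply been replaced, the lookup by its old identifier would fail and client code would have to be updated.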

IDOMAL and IDO 

As mentioned earlier, owing to the timing of the development
of the two ontologies, the published version of IDOMAL
has some features that make it difficult to call it, as such,
an extension of IDO in the latter's present form.
The example given above, i.e. the term symptom being
imported from two different ontologies, exemplifies this
problem while, at the same time, showing that the
differences are not necessarily irreconcilable: the easiest
solution for this kind of discrepancy would be to simply
merge the two terms. There are several more examples
of how differences may be eliminated. Table 2 lists
these, together with the actions taken or to be
taken. For example, while antiparasitic drug is a role in
IDOMAL, IDO contains a term antiparasitic disposition
and defines antiparasitic material entity as an entity
that bears an antiparasitic disposition (IDO contains
dispositions such as antibacterial, antifungal, antiparasitic,
antimicrobial and antiviral). We could easily reconcile the
difference by accepting that antimalarial drug in
IDOMAL is a role borne by a material entity which has
antiparasitic disposition and is given to a patient to treat
malaria. This would follow the example set in IDO by
antiseptic role (definition: A role borne by a material
entity in virtue of the fact that it has an antimicrobial
disposition and is applied to an anatomical entity of a living
organism). There are a few more cases of discrepancy
between the two ontologies, and for some of them we
have decided to adopt the IDO point of view. For
example, endemicity, which was a disposition in IDOMAL,
will be changed to quality of a population, and so
will the terms holoendemicity, hypoendemicity and
mesoendemicity, which are absent from IDO. We have
also changed resistance from quality to disposition;
although a good case has been made that resistance
is a disposition [19], we should nevertheless state here
that resistance is in most, if not all, cases a genetic
phenotype. Without going into further discussion,
we simply note that phenotypes are usually considered
to be qualities, possibly because several
of them are visible (e.g. white eyes, ectopic
expression, etc.). Finally, habitat, so far a spatial region in
IDOMAL, has been changed to site, as in IDO.
To fit the present ontological representation,
breeding site also had to be redefined as a
role borne by a material entity (i.e. site).
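
The distinction between a disposition and a role that underlies the reconciliation of antimalarial drug can be made concrete with a small sketch. It is an illustration only, not part of either ontology: the `MaterialEntity` class and the `bears_antimalarial_drug_role` function are hypothetical names, and the Boolean flag is a deliberately crude stand-in for "is given to a patient to treat malaria".

```python
from dataclasses import dataclass, field


@dataclass
class MaterialEntity:
    name: str
    # Dispositions belong to the entity itself, independent of how it is used.
    dispositions: frozenset = field(default_factory=frozenset)


def bears_antimalarial_drug_role(entity, given_to_treat_malaria):
    """A role depends on context: the entity must have the disposition
    AND be given to a patient to treat malaria."""
    has_disposition = "antiparasitic disposition" in entity.dispositions
    return has_disposition and given_to_treat_malaria


if __name__ == "__main__":
    # Artemisinin is used here only as an illustrative material entity.
    artemisinin = MaterialEntity(
        "artemisinin", frozenset({"antiparasitic disposition"})
    )
    print(bears_antimalarial_drug_role(artemisinin, given_to_treat_malaria=True))
    print(bears_antimalarial_drug_role(artemisinin, given_to_treat_malaria=False))
```

The same substance keeps its disposition in both calls, but bears the role only in the first, which mirrors the definition of antiseptic role quoted from IDO.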

In spite of the changes made, a series of issues remain
unresolved. Some of them are, in our
opinion, relatively secondary and could be
resolved easily. For example, terms such as treatment and
refractoriness, which we deem necessary for an
ontology of vector-borne diseases, could be reinstated
in IDO, from which they were obsoleted some time ago.

The remaining open issues are due to the distinctive
properties of vector-borne diseases. These infections
arise through the biological interactions between
three organisms (patient/host, vector and pathogen),
rather than only two as is
the case in the vast majority of infectious diseases.
Thus, an ontology such as IDOMAL has to capture all
three organisms, as well as prominently include terms
on the respective populations. For example, control of
malaria, eventually leading, perhaps, to its elimination, is
predominantly based on vector control. This involves
measures aimed, for example, at reducing mosquito
populations, possibly using genetic approaches [20] or, as
may be the case in the future if planned strategies
succeed, at replacing vector populations with others that
are unable to transmit the pathogen [21].

Table 2 Differences between IDO and IDOMAL and steps (to be) taken to unify the ontologies

IDO-IDOMAL "comparison" | Action | Comments
IDO: symptom imported from OGMS, no children; IDOMAL: native term, 39 children | Imported/merged terms from Symptom Ontology |
IDO: endemicity is quality of a population; IDOMAL: endemicity is disposition | Changed in IDOMAL, now as in IDO |
IDO: habitat is site (material entity); IDOMAL: habitat is spatial region | Changed in IDOMAL, now as in IDO |
IDO: breeding site absent; IDOMAL: Anopheles breeding site is a role | Changed to Anopheles breeding site role borne by site (material entity) |
IDO: epidemiological types [of an infectious disease] absent; IDOMAL: epidemiological types of malaria present | Discuss with IDO developers | Needed in IDOMAL
IDO: antiparasitic material entity is material entity bearing antiparasitic disposition; IDOMAL: antiparasitic drug is a role | Issue solved in IDOMAL: antimalarial drug is a role borne by a material entity which has antiparasitic disposition | Similar for antibiotic role/disposition, etc.
IDO: resistance is a disposition; IDOMAL: resistance is quality (of vector population) | Changed in IDOMAL, now as in IDO |
IDO: infection is a material entity; IDOMAL: infection is not present, infectious disposition is not present | No action needed | Terms are not needed in IDOMAL
IDO: vector competence is not present; IDOMAL: vector competence is present | Discuss with IDO developers | Needed in IDOMAL
IDO: infectious disease control is absent; IDOMAL: malaria control is present | Discuss with IDO developers | Needed in IDOMAL
IDO: no terms for decision support systems, [disease] and vector control, intervention methods; IDOMAL: corresponding terms present | Discuss with IDO developers | Needed in IDOMAL
IDO: refractoriness is obsolete term; IDOMAL: refractoriness is present | Discuss with IDO developers | Needed in IDOMAL
IDO: treatment is obsolete term in IDO; IDOMAL: treatment is present | Discuss with IDO developers | Needed in IDOMAL
IDO: holoendemicity / hypoendemicity / mesoendemicity are missing; IDOMAL: terms are present | Discuss with IDO developers | Needed in IDOMAL
IDO: quality of infectious disease is absent; IDOMAL: quality of malaria is present | Discuss with IDO developers | Needed in IDOMAL, acts as a placeholder for terms relating to malaria-specific interventions.
IDO: process of infectious disease is absent; IDOMAL: process of malaria is present | Discuss with IDO developers | Needed in IDOMAL, coverage of physiological and pathophysiological processes occurring in host and/or vector and/or parasite.
IDO: no differentiation between host/patient and pathogen; IDOMAL: differentiates between host/patient, vector and pathogen | Discuss with IDO developers |

Both IDO and IDOMAL use BFO [9-11] as an
upper-level ontology. IDO would then describe the
infectious disease domain, while IDOMAL would ideally
be placed below it. The current structure of IDO,
though, does not allow for a full deployment of the
malaria domain. IDOMAL has separated several classes of
terms into three main "groups", namely the patient/host,
the vector and the pathogen, or, to be more precise, into
six groups, since populations are "treated" separately.
The reason for this is that many terms apply to both
patient/host and vector (both being metazoan). Similarly,
the latter separation is due to the fact that
several terms are specific to populations rather than
individuals; this is especially true for vector control.

Another problem that is not yet resolved is how to
list, in a grouped and ontologically correct form, terms
such as pathogen specific form of malaria and
epidemiological type of malaria. A solution could be to create
sub-ontologies for each of the different forms of
malaria (i.e. for P. falciparum, P. malariae, P. ovale and
P. vivax); we consider this to be impractical in several
respects. For the time being we keep the
problematic class quality of malaria, and we will aim to find
an appropriate way to describe these features in
collaboration with the IDO consortium. The class process of
malaria, thus, groups a series of physiological and
pathophysiological processes occurring in the patient/host
and/or the parasite. Finally, IDOMAL has no
placeholders for vector-specific processes (e.g. host seeking)
or qualities (e.g. vector competence). Of course, all of
these terms could be listed as direct children of process
and quality, but we think that a more detailed
classification would benefit the users of the ontology and,
especially, would make it easier to design other
ontologies for vector-borne diseases. A similar consideration
holds for malaria prevention and vector control, terms
that need to be included and are now under the
"placeholder" process of malaria. It should be noted here that
the Vector Surveillance and Management
Ontology (VSMO), which covers the domain of vector
control, was recently published [22]. This development may
make it easier to find a partial solution to the
last-mentioned problem.

Conclusions 

It was not unexpected that IDOMAL had to undergo
several updates, partial revisions and expansions during
the three years after it was published, all of which are
summarized in this report. Not only is it legitimate to always
try to obtain a better "product", but some of the
changes are also dictated by the needs of the community (e.g.
remedies and natural products) or by recent developments
in the field. As mentioned at the beginning, improved
IT tools are becoming indispensable, especially as
high-throughput technology develops and provides more data.
In the case of malaria and other vector-borne diseases,
this evolution is obvious. Only about ten years after the
determination of the genome sequence of Anopheles
gambiae [23], tens of genomes of different vectors have
become available [24]. Although so far genes are usually
annotated only with GO terms [25], the day is not
far off when they, and other data in genomic/biological
databases such as, for example, VectorBase, the database
that covers arthropod disease vectors [26], will also be
annotated with ontological terms describing these domains.
Moreover, novel IT tools such as decision
support systems are already making use of ontologies
[27] and, furthermore, tools are planned that will be able
to direct information to and from ontologies and data
holders [28]. In the domain of vector-borne diseases,
IDOMAL and MIRO, an ontology of Mosquito
Insecticide Resistance [29] that has now been fully integrated
in IDOMAL, are used by newly developed Decision
Support Systems for vector-borne diseases [30,31].
Furthermore, VSMO also uses several terms that have been
imported from IDOMAL [22]. It becomes clear that the
availability of all the new, open bio-medical ontologies
provides ways to achieve enhanced interoperability
between databases and to expand the title of the original
publication of the Gene Ontology [32] to "tools for the
unification of bio-medical sciences".

N.B. Both IDOMAL versions are available for
download: the OBO version is at the OBO Foundry
and at the NCBO BioPortal, while the OWL version is
available at: http://anobase.vectorbase.org/idomal/.